Tao Zhang, Central Michigan University, zhang3t@cmich.edu
PRIMARY
Qi Liao, Central Michigan University, qi.liao@cmich.edu
Lei Shi, Institute of Software, Chinese Academy of Sciences, China, shijim@gmail.com
Student Team: YES
Google Earth, developed by Google Inc.,
http://www.google.com/earth/index.html
Python, http://www.python.org/
3DBarModelGenerator4Kml, developed by Zhang, CMICH
Video:
Answers to Mini-Challenge 1 Questions:
MC 1.1 Create a
visualization of the health and policy status of the entire Bank of Money
enterprise as of 2 pm BMT (BankWorld Mean Time) on
February 2. What areas of concern do you observe?
The following three snapshots (Figures 1.1-1.3) show the distribution of
all machines' policy and activity status, aggregated by 50 regions (40
small, 10 large) and the headquarters. The yellow 3D bars represent the sum
of all machines' policy status values in each region, encoded as bar height
(altitude). Each machine's policy value is reduced by one before summing, so
that a policy status of '1' contributes zero bar height and larger policy
values contribute positive height. The red 3D bars represent the sum of all
machines' activity flags in each region. An activity flag of '1' ('normal')
contributes zero; any activity flag larger than '1' contributes a value of
'1' to the summed activity.
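The two bar heights described above can be written compactly. The symbols below are our own shorthand, not notation from the challenge data: M_r is the set of machines in region r, p_m is the policy status of machine m, and a_m is its activity flag.

\[
H^{\mathrm{policy}}_r = \sum_{m \in M_r} (p_m - 1),
\qquad
H^{\mathrm{activity}}_r = \sum_{m \in M_r} \mathbf{1}[a_m > 1]
\]

For example, a region with three machines whose policy statuses are 1, 3 and 5 and whose activity flags are 1, 4 and 2 would get a policy bar of 0 + 2 + 4 = 6 and an activity bar of 0 + 1 + 1 = 2.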
These visualizations give an overview at a single point in time by comparing
every region's degree of abnormality as 3D bars of different heights. Three
areas raise particular concern. The first is 'Hatt', region 10, which has the
highest policy status bar. The second, 'Lomu', region 5, was chosen for the
same reason: it has the second-highest policy status bar in BankWorld. The
last is 'Zizzer', the headquarters region. The policy status bars of these
three regions are much taller than those of the other regions. The
headquarters' activity flag bar is also noticeably tall, which is worth
investigating.
Figure 1.1: Region 10, Hatt.
Figure 1.2: Region 5, Lomu.
Figure 1.3: Headquarters, Zizzer.
MC 1.2 Use your
visualization tools to look at how the network’s status changes over
time. Highlight up to five potential anomalies in the network and provide a
visualization of each. When did each anomaly begin and end? What might be an
explanation of each anomaly?
Again, our visualization uses the region as the aggregation granularity.
Using BMT time as another dimension, we obtain a time series of status
distributions, one every 15 minutes. This time, however, we show only one bar
per region, corresponding to either the connection, activity, or policy value.
The three attributes (numConnections, policyStatus, activityFlag) are
calculated as follows. For numConnections we use the mean, dividing the summed
value by the number of machines in the region. The policyStatus attribute has
5 values indicating how severely a machine deviates from policy, from 1
(normal) to 5 (very dangerous). To emphasize abnormalities, 1 is subtracted
from each policyStatus value before summing, so that a normal machine's policy
('1') is not counted. The activityFlag attribute has 5 values. Value '1' means
working normally; values '2 - 4' indicate different abnormal activities on a
machine. All of the '2 - 4' values are worth investigating, so a value of '1'
is counted as '0' and every other value as '1'. After this calculation, the
summed value tells us how many abnormal machines are in the region. An
animated (movie) visualization is used to connect the dots across the
timeline.
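The three calculations above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the submission: the record layout (a dict with 'region', 'time', 'numConnections', 'policyStatus' and 'activityFlag' keys) and the function name are assumptions, and the mean divides by the number of machines that report in a given time slot.

```python
from collections import defaultdict


def aggregate_by_region(records):
    """Aggregate machine records into per-(region, time) bar values.

    Each record is a dict with the hypothetical keys 'region', 'time',
    'numConnections', 'policyStatus' and 'activityFlag'.
    """
    totals = defaultdict(lambda: {"machines": 0, "connections": 0,
                                  "policy": 0, "activity": 0})
    for rec in records:
        bucket = totals[(rec["region"], rec["time"])]
        bucket["machines"] += 1
        bucket["connections"] += rec["numConnections"]
        # Normal policy status (1) contributes 0; 5 contributes 4.
        bucket["policy"] += rec["policyStatus"] - 1
        # Any activity flag other than 1 counts as one abnormal machine.
        bucket["activity"] += 1 if rec["activityFlag"] > 1 else 0

    result = {}
    for key, b in totals.items():
        result[key] = {
            "meanConnections": b["connections"] / b["machines"],
            "policySum": b["policy"],
            "abnormalMachines": b["activity"],
        }
    return result
```

Keying the result by (region, time) gives one value set per region for each 15-minute slot, which matches the one-bar-per-region-per-frame animation described above.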
After data preparation, we use a KML file generator written in Python to collect each
region's information (location, attributes, time, and so on) and generate KML files. These KML files can be
viewed in GIS systems such as Google Earth.
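To make the KML step concrete, the sketch below shows one way a time-stamped 3D bar could be emitted as KML. It is not the 3DBarModelGenerator4Kml tool itself, whose internals are not described here; the function names, the square footprint, the default half-width and the color default are all assumptions. It relies only on standard KML 2.2 elements (Placemark, TimeSpan, PolyStyle, and an extruded Polygon).

```python
from xml.sax.saxutils import escape


def bar_placemark(name, lon, lat, height_m, begin, end,
                  color="ff00ffff", half_width_deg=0.05):
    """Return a KML Placemark that draws one extruded square bar.

    color uses KML's aabbggrr hex order (ff00ffff is opaque yellow).
    begin and end are dateTime strings such as 'YYYY-MM-DDThh:mm:ssZ'.
    """
    d = half_width_deg
    corners = [(lon - d, lat - d), (lon + d, lat - d),
               (lon + d, lat + d), (lon - d, lat + d),
               (lon - d, lat - d)]  # the ring closes on its first corner
    coords = " ".join(f"{x:.6f},{y:.6f},{height_m:.1f}" for x, y in corners)
    return (
        "<Placemark>"
        f"<name>{escape(name)}</name>"
        f"<TimeSpan><begin>{begin}</begin><end>{end}</end></TimeSpan>"
        f"<Style><PolyStyle><color>{color}</color></PolyStyle></Style>"
        "<Polygon><extrude>1</extrude>"
        "<altitudeMode>relativeToGround</altitudeMode>"
        "<outerBoundaryIs><LinearRing>"
        f"<coordinates>{coords}</coordinates>"
        "</LinearRing></outerBoundaryIs></Polygon>"
        "</Placemark>"
    )


def kml_document(placemarks):
    """Wrap an iterable of Placemark strings in a KML 2.2 document."""
    body = "".join(placemarks)
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<kml xmlns="http://www.opengis.net/kml/2.2">'
            f"<Document>{body}</Document></kml>")
```

Giving each bar a TimeSpan that covers one 15-minute slot lets the Google Earth time slider play the bars back as an animation. The mapping from an aggregated value to a height in meters is left to the caller, since the scale factor depends on how tall the bars should look at a given zoom level.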
We group by region rather than by a finer unit (e.g., branch) because grouping
by branch would generate more than 4000 3D bars, which would be hard to display
and hard for a viewer to tell apart. Grouping by branch would, however, be an
appropriate way to show more detail when studying a single region.
Anomaly #1: Anomalous activity flag changes
Anomaly location: the large regional offices (regions 1 – 10 and the headquarters)
Anomaly time: 11am - 3am (next day), every day
Anomaly degree: during this time, the activityFlag value of each large
regional office rises well above that of the small regional offices around it.
This elevated activityFlag level lasts until 3am each day.
Anomaly explanation: In our visualization, the height of the activityFlag
bar represents the number of abnormal machines. The high frequency of abnormal
situations might be due to heavier machine usage in the large regional offices.
The issues might last until 3am each day because maintenance is carried out
daily at around that time.
Figure 2.1: 11am, activity flag
Figure 2.2: 5:22pm, activity flag
Figure 2.3: 3am, activity flag
Anomaly #2: Anomalous policy status changes
Anomaly location: the large regional offices (regions 1 – 10 and the headquarters)
Anomaly time: rising continuously over the whole logged period
Anomaly degree: region 5 and region 10 start with higher policy status bars.
Over the period covered by the log data, the policyStatus value of every large
regional office rises well above that of the small regional offices around it.
Almost all of the bars keep rising until the end of the period. The
headquarters' policyStatus bar is the highest at the end of the period.
Anomaly explanation: In our visualization, the height of the policy status
bar represents how severe the policy issues on a region's machines are. A
higher value might indicate that the region is under attack. The rising
pattern of the policy status could be a critical issue for BoM, especially at
the headquarters. It seems that most policy deviation warnings remain
unresolved at the end, since the sum of policyStatus keeps increasing. This is
only one hypothesis, however. Another possibility is that policyStatus
increases in all large regional offices because these offices are common
targets for various intrusion attempts.
Figure 2.4: Policy status at the beginning of the period
Figure 2.5: Policy status keeps rising in the middle of the period
Figure 2.6: Headquarters with the highest policy status at the end